Missing Generalizations: A Supervised Machine Learning Approach to L2 Written Production

Authors

  • Daniel Wiechmann
  • Elma Kerz
Abstract

Recent years have witnessed a growing interest in usage-based models of language, which characterize linguistic knowledge in terms of emerging generalizations derived from experience with language via processes of similarity-based distributional analysis and analogical reasoning. Language learning then involves building the right generalizations, that is, recognizing and recreating the statistical regularities underlying the target language. Focusing on the domain of relativization, this study examines to what extent the generalizations of advanced second language learners about the usage of complex constructions differ from those of experts in written production. We approach this question through supervised machine learning, using random forests with conditional inference trees as base learners as the primary modeling tool.

Similar resources

Collective Kernel Construction in Noisy Environment

Kernels are similarity functions and play important roles in machine learning. Traditional kernels are built directly from the feature vectors of data instances x_i, x_j. However, data can be noisy, and feature vectors may contain missing or corrupted values. In this paper, we propose a new approach to building a kernel, the Collective Kernel, especially from noisy data. We also derive an effic...

Emotion Detection in Persian Text; A Machine Learning Model

This study aimed to develop a computational model for recognizing emotion in Persian text, framed as a supervised machine learning problem. We used Plutchik's emotion model as the supervised learning criterion and a Support Vector Machine (SVM) as the baseline classifier. We also used the NRC lexicon and contextual features as training data and components of the model. One hundred selected texts including pol...

Imputation of Missing Data Using Machine Learning Techniques

A serious problem in mining industrial databases is that they are often incomplete: a significant amount of data is missing or erroneously entered. This paper explores the use of machine-learning-based alternatives to standard statistical data completion (data imputation) methods for dealing with missing data. We have approached the data completion problem using two well-known machine le...

Missing Value Imputation Using a Semi-supervised Rank Aggregation Approach

One relevant problem in data quality is the presence of missing data. In cases where missing data are abundant, effective ways to deal with these absences could improve the performance of machine learning algorithms. Missing data can be treated using imputation. Imputation methods replace the missing data by values estimated from the available data. This paper presents Corai, an imputation algo...

Towards a more efficient representation of imputation operators in TPOT

Automated Machine Learning encompasses a set of meta-algorithms intended to design and apply machine learning techniques (e.g., model selection, hyperparameter tuning, model assessment). TPOT, a software package for optimizing machine learning pipelines based on genetic programming (GP), is a novel example of this kind of application. Recently we have proposed a way to introduce imputation metho...

Publication date: 2014